Exploration server: Health and musical practice

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis.

Internal identifier: 001677 (Main/Exploration); previous: 001676; next: 001678

Authors: Cédric Févotte [France]; Nancy Bertin; Jean-Louis Durrieu

Source: Neural computation, vol. 21, no. 3 (March 2009), pp. 793-830

RBID : pubmed:18785855

Abstract

This letter presents theoretical, algorithmic, and experimental results about nonnegative matrix factorization (NMF) with the Itakura-Saito (IS) divergence. We describe how IS-NMF is underlaid by a well-defined statistical model of superimposed gaussian components and is equivalent to maximum likelihood estimation of variance parameters. This setting can accommodate regularization constraints on the factors through Bayesian priors. In particular, inverse-gamma and gamma Markov chain priors are considered in this work. Estimation can be carried out using a space-alternating generalized expectation-maximization (SAGE) algorithm; this leads to a novel type of NMF algorithm, whose convergence to a stationary point of the IS cost function is guaranteed. We also discuss the links between the IS divergence and other cost functions used in NMF, in particular, the Euclidean distance and the generalized Kullback-Leibler (KL) divergence. As such, we describe how IS-NMF can also be performed using a gradient multiplicative algorithm (a standard algorithm structure in NMF) whose convergence is observed in practice, though not proven. Finally, we report a furnished experimental comparative study of Euclidean-NMF, KL-NMF, and IS-NMF algorithms applied to the power spectrogram of a short piano sequence recorded in real conditions, with various initializations and model orders. Then we show how IS-NMF can successfully be employed for denoising and upmix (mono to stereo conversion) of an original piece of early jazz music. These experiments indicate that IS-NMF correctly captures the semantics of audio and is better suited to the representation of music signals than NMF with the usual Euclidean and KL costs.
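
The multiplicative gradient algorithm mentioned in the abstract can be illustrated with a short NumPy sketch. This is a minimal version of the update rule commonly used for IS-NMF, not the authors' code: the function names, the random initialization, the iteration count, and the synthetic test matrix are all assumptions made for illustration.

```python
import numpy as np

def is_divergence(V, V_hat):
    """Itakura-Saito divergence, summed over all matrix entries."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))

def is_nmf(V, K, n_iter=200, seed=0, eps=1e-12):
    """Approximate a strictly positive matrix V by W @ H (K components)
    using multiplicative updates for the IS cost (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    F, N = V.shape
    # Random strictly positive initialization (an assumption of this sketch).
    W = rng.random((F, K)) + eps
    H = rng.random((K, N)) + eps
    costs = [is_divergence(V, W @ H)]
    for _ in range(n_iter):
        # Update H: ratio of the negative and positive parts of the gradient.
        V_hat = W @ H
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
        # Update W with the refreshed approximation.
        V_hat = W @ H
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
        costs.append(is_divergence(V, W @ H))
    return W, H, costs

# Synthetic rank-2 "power spectrogram" (made-up data, for illustration only).
rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 8)) + 1e-9
W, H, costs = is_nmf(V, K=2)
print(f"IS cost: initial {costs[0]:.4f}, final {costs[-1]:.6f}")
```

The factors stay nonnegative because each update multiplies them by a ratio of nonnegative quantities. As the abstract notes, the decrease of the cost under this scheme is observed in practice rather than proven.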

DOI: 10.1162/neco.2008.04-08-771
PubMed: 18785855


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis.</title>
<author>
<name sortKey="Fevotte, Cedric" sort="Fevotte, Cedric" uniqKey="Fevotte C" first="Cédric" last="Févotte">Cédric Févotte</name>
<affiliation wicri:level="3">
<nlm:affiliation>CNRS-TELECOM ParisTech, 75014 Paris, France. fevotte@telecom-paristech.fr</nlm:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>CNRS-TELECOM ParisTech, 75014 Paris</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Bertin, Nancy" sort="Bertin, Nancy" uniqKey="Bertin N" first="Nancy" last="Bertin">Nancy Bertin</name>
</author>
<author>
<name sortKey="Durrieu, Jean Louis" sort="Durrieu, Jean Louis" uniqKey="Durrieu J" first="Jean-Louis" last="Durrieu">Jean-Louis Durrieu</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2009">2009</date>
<idno type="RBID">pubmed:18785855</idno>
<idno type="pmid">18785855</idno>
<idno type="doi">10.1162/neco.2008.04-08-771</idno>
<idno type="wicri:Area/Main/Corpus">001754</idno>
<idno type="wicri:explorRef" wicri:stream="Main" wicri:step="Corpus" wicri:corpus="PubMed">001754</idno>
<idno type="wicri:Area/Main/Curation">001754</idno>
<idno type="wicri:explorRef" wicri:stream="Main" wicri:step="Curation">001754</idno>
<idno type="wicri:Area/Main/Exploration">001754</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis.</title>
<author>
<name sortKey="Fevotte, Cedric" sort="Fevotte, Cedric" uniqKey="Fevotte C" first="Cédric" last="Févotte">Cédric Févotte</name>
<affiliation wicri:level="3">
<nlm:affiliation>CNRS-TELECOM ParisTech, 75014 Paris, France. fevotte@telecom-paristech.fr</nlm:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>CNRS-TELECOM ParisTech, 75014 Paris</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Bertin, Nancy" sort="Bertin, Nancy" uniqKey="Bertin N" first="Nancy" last="Bertin">Nancy Bertin</name>
</author>
<author>
<name sortKey="Durrieu, Jean Louis" sort="Durrieu, Jean Louis" uniqKey="Durrieu J" first="Jean-Louis" last="Durrieu">Jean-Louis Durrieu</name>
</author>
</analytic>
<series>
<title level="j">Neural computation</title>
<idno type="ISSN">0899-7667</idno>
<imprint>
<date when="2009" type="published">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Acoustic Stimulation (MeSH)</term>
<term>Algorithms (MeSH)</term>
<term>Humans (MeSH)</term>
<term>Information Storage and Retrieval (MeSH)</term>
<term>Markov Chains (MeSH)</term>
<term>Models, Statistical (MeSH)</term>
<term>Music (MeSH)</term>
<term>Pattern Recognition, Automated (MeSH)</term>
<term>Pitch Perception (physiology)</term>
<term>Sound Spectrography (methods)</term>
<term>Time Factors (MeSH)</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes (MeSH)</term>
<term>Chaines de Markov (MeSH)</term>
<term>Facteurs temps (MeSH)</term>
<term>Humains (MeSH)</term>
<term>Modèles statistiques (MeSH)</term>
<term>Musique (MeSH)</term>
<term>Mémorisation et recherche des informations (MeSH)</term>
<term>Perception de la hauteur tonale (physiologie)</term>
<term>Reconnaissance automatique des formes (MeSH)</term>
<term>Spectrographie sonore (méthodes)</term>
<term>Stimulation acoustique (MeSH)</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Sound Spectrography</term>
</keywords>
<keywords scheme="MESH" qualifier="méthodes" xml:lang="fr">
<term>Spectrographie sonore</term>
</keywords>
<keywords scheme="MESH" qualifier="physiologie" xml:lang="fr">
<term>Perception de la hauteur tonale</term>
</keywords>
<keywords scheme="MESH" qualifier="physiology" xml:lang="en">
<term>Pitch Perception</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Acoustic Stimulation</term>
<term>Algorithms</term>
<term>Humans</term>
<term>Information Storage and Retrieval</term>
<term>Markov Chains</term>
<term>Models, Statistical</term>
<term>Music</term>
<term>Pattern Recognition, Automated</term>
<term>Time Factors</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Chaines de Markov</term>
<term>Facteurs temps</term>
<term>Humains</term>
<term>Modèles statistiques</term>
<term>Musique</term>
<term>Mémorisation et recherche des informations</term>
<term>Reconnaissance automatique des formes</term>
<term>Stimulation acoustique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This letter presents theoretical, algorithmic, and experimental results about nonnegative matrix factorization (NMF) with the Itakura-Saito (IS) divergence. We describe how IS-NMF is underlaid by a well-defined statistical model of superimposed gaussian components and is equivalent to maximum likelihood estimation of variance parameters. This setting can accommodate regularization constraints on the factors through Bayesian priors. In particular, inverse-gamma and gamma Markov chain priors are considered in this work. Estimation can be carried out using a space-alternating generalized expectation-maximization (SAGE) algorithm; this leads to a novel type of NMF algorithm, whose convergence to a stationary point of the IS cost function is guaranteed. We also discuss the links between the IS divergence and other cost functions used in NMF, in particular, the Euclidean distance and the generalized Kullback-Leibler (KL) divergence. As such, we describe how IS-NMF can also be performed using a gradient multiplicative algorithm (a standard algorithm structure in NMF) whose convergence is observed in practice, though not proven. Finally, we report a furnished experimental comparative study of Euclidean-NMF, KL-NMF, and IS-NMF algorithms applied to the power spectrogram of a short piano sequence recorded in real conditions, with various initializations and model orders. Then we show how IS-NMF can successfully be employed for denoising and upmix (mono to stereo conversion) of an original piece of early jazz music. These experiments indicate that IS-NMF correctly captures the semantics of audio and is better suited to the representation of music signals than NMF with the usual Euclidean and KL costs.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">18785855</PMID>
<DateCompleted>
<Year>2009</Year>
<Month>03</Month>
<Day>31</Day>
</DateCompleted>
<DateRevised>
<Year>2009</Year>
<Month>02</Month>
<Day>27</Day>
</DateRevised>
<Article PubModel="Print">
<Journal>
<ISSN IssnType="Print">0899-7667</ISSN>
<JournalIssue CitedMedium="Print">
<Volume>21</Volume>
<Issue>3</Issue>
<PubDate>
<Year>2009</Year>
<Month>Mar</Month>
</PubDate>
</JournalIssue>
<Title>Neural computation</Title>
<ISOAbbreviation>Neural Comput</ISOAbbreviation>
</Journal>
<ArticleTitle>Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis.</ArticleTitle>
<Pagination>
<MedlinePgn>793-830</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1162/neco.2008.04-08-771</ELocationID>
<Abstract>
<AbstractText>This letter presents theoretical, algorithmic, and experimental results about nonnegative matrix factorization (NMF) with the Itakura-Saito (IS) divergence. We describe how IS-NMF is underlaid by a well-defined statistical model of superimposed gaussian components and is equivalent to maximum likelihood estimation of variance parameters. This setting can accommodate regularization constraints on the factors through Bayesian priors. In particular, inverse-gamma and gamma Markov chain priors are considered in this work. Estimation can be carried out using a space-alternating generalized expectation-maximization (SAGE) algorithm; this leads to a novel type of NMF algorithm, whose convergence to a stationary point of the IS cost function is guaranteed. We also discuss the links between the IS divergence and other cost functions used in NMF, in particular, the Euclidean distance and the generalized Kullback-Leibler (KL) divergence. As such, we describe how IS-NMF can also be performed using a gradient multiplicative algorithm (a standard algorithm structure in NMF) whose convergence is observed in practice, though not proven. Finally, we report a furnished experimental comparative study of Euclidean-NMF, KL-NMF, and IS-NMF algorithms applied to the power spectrogram of a short piano sequence recorded in real conditions, with various initializations and model orders. Then we show how IS-NMF can successfully be employed for denoising and upmix (mono to stereo conversion) of an original piece of early jazz music. These experiments indicate that IS-NMF correctly captures the semantics of audio and is better suited to the representation of music signals than NMF with the usual Euclidean and KL costs.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Févotte</LastName>
<ForeName>Cédric</ForeName>
<Initials>C</Initials>
<AffiliationInfo>
<Affiliation>CNRS-TELECOM ParisTech, 75014 Paris, France. fevotte@telecom-paristech.fr</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Bertin</LastName>
<ForeName>Nancy</ForeName>
<Initials>N</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Durrieu</LastName>
<ForeName>Jean-Louis</ForeName>
<Initials>JL</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>Neural Comput</MedlineTA>
<NlmUniqueID>9426182</NlmUniqueID>
<ISSNLinking>0899-7667</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000161" MajorTopicYN="N">Acoustic Stimulation</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="Y">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016247" MajorTopicYN="N">Information Storage and Retrieval</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D008390" MajorTopicYN="Y">Markov Chains</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D015233" MajorTopicYN="Y">Models, Statistical</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D009146" MajorTopicYN="Y">Music</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D010363" MajorTopicYN="N">Pattern Recognition, Automated</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D010898" MajorTopicYN="N">Pitch Perception</DescriptorName>
<QualifierName UI="Q000502" MajorTopicYN="Y">physiology</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D013018" MajorTopicYN="N">Sound Spectrography</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="N">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D013997" MajorTopicYN="N">Time Factors</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="pubmed">
<Year>2008</Year>
<Month>9</Month>
<Day>13</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2009</Year>
<Month>4</Month>
<Day>1</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2008</Year>
<Month>9</Month>
<Day>13</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">18785855</ArticleId>
<ArticleId IdType="doi">10.1162/neco.2008.04-08-771</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Île-de-France</li>
</region>
<settlement>
<li>Paris</li>
</settlement>
</list>
<tree>
<noCountry>
<name sortKey="Bertin, Nancy" sort="Bertin, Nancy" uniqKey="Bertin N" first="Nancy" last="Bertin">Nancy Bertin</name>
<name sortKey="Durrieu, Jean Louis" sort="Durrieu, Jean Louis" uniqKey="Durrieu J" first="Jean-Louis" last="Durrieu">Jean-Louis Durrieu</name>
</noCountry>
<country name="France">
<region name="Île-de-France">
<name sortKey="Fevotte, Cedric" sort="Fevotte, Cedric" uniqKey="Fevotte C" first="Cédric" last="Févotte">Cédric Févotte</name>
</region>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/SanteMusiqueV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001677 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001677 | SxmlIndent | more
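
Without Dilib, the `<pubmed>` part of the record can also be read with a general-purpose XML parser. The sketch below uses Python's standard library on a trimmed copy of the `<MedlineCitation>` element from the record; the `summarize` helper and the `FRAGMENT` constant are hypothetical names introduced here. The TEI part as displayed uses `wicri:` and `nlm:` prefixes without visible namespace declarations, so a strict parser would likely reject the full record unless those are declared.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <MedlineCitation> element from the record
# (most child elements omitted for brevity).
FRAGMENT = """
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">18785855</PMID>
<Article PubModel="Print">
<ArticleTitle>Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis.</ArticleTitle>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y"><LastName>Févotte</LastName><ForeName>Cédric</ForeName><Initials>C</Initials></Author>
<Author ValidYN="Y"><LastName>Bertin</LastName><ForeName>Nancy</ForeName><Initials>N</Initials></Author>
<Author ValidYN="Y"><LastName>Durrieu</LastName><ForeName>Jean-Louis</ForeName><Initials>JL</Initials></Author>
</AuthorList>
</Article>
</MedlineCitation>
"""

def summarize(xml_text):
    """Return the PMID, title, and author names from a MedlineCitation element."""
    root = ET.fromstring(xml_text.strip())
    authors = [
        f"{a.findtext('ForeName')} {a.findtext('LastName')}"
        for a in root.iter("Author")
    ]
    return {
        "pmid": root.findtext("PMID"),
        "title": root.findtext("Article/ArticleTitle"),
        "authors": authors,
    }

summary = summarize(FRAGMENT)
print(summary["pmid"], "-", "; ".join(summary["authors"]))
```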

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    SanteMusiqueV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     pubmed:18785855
   |texte=   Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:18785855" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a SanteMusiqueV1 

This area was generated with Dilib version V0.6.38.
Data generation: Mon Mar 8 15:23:44 2021. Site generation: Mon Mar 8 15:23:58 2021